Smart Paper Notes

Sparse Ellipsometry: Portable Acquisition of Polarimetric SVBRDF and Shape with Unstructured Flash Photography

Inseung Hwang, Daniel S. Jeon, Adolfo Muñoz, Diego Gutierrez, Xin Tong, Min H. Kim

Category: Computer Vision

2022-07-09

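Ellipsometry techniques measure the polarization information of materials, but they require precise rotations of optical components under different light and sensor configurations. This leads to cumbersome capture devices that must be carefully calibrated under laboratory conditions, and to very long acquisition times, usually on the order of a few days per object. Recent techniques can capture polarimetric spatially-varying reflectance information, but they are either limited to a single view, or cover all view directions but only for spherical objects made of a single homogeneous material. We present sparse ellipsometry, a portable polarimetric acquisition method that captures polarimetric SVBRDF and 3D shape simultaneously. Our handheld device consists of off-the-shelf, fixed optical components. The total acquisition time per object is on the order of twenty minutes, rather than days. We develop a complete polarimetric SVBRDF model that includes diffuse and specular components as well as single scattering, and we devise a novel polarimetric inverse rendering algorithm with data augmentation of specular reflection samples via generative modeling. Our results show strong agreement with a recent ground-truth dataset of captured polarimetric BRDFs of real-world objects.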
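
As background for the polarization measurements mentioned above, the following is a minimal sketch of textbook Mueller calculus, not the paper's polarimetric SVBRDF model. It assumes ideal linear polarizers and Stokes vectors written as [I, Q, U, V]; the function names `linear_polarizer` and `apply_mueller` are hypothetical and chosen only for this illustration.

```python
import math

def linear_polarizer(theta):
    """Mueller matrix of an ideal linear polarizer with its axis at angle theta (radians)."""
    c, s = math.cos(2 * theta), math.sin(2 * theta)
    return [
        [0.5,     0.5 * c,     0.5 * s,     0.0],
        [0.5 * c, 0.5 * c * c, 0.5 * c * s, 0.0],
        [0.5 * s, 0.5 * c * s, 0.5 * s * s, 0.0],
        [0.0,     0.0,         0.0,         0.0],
    ]

def apply_mueller(mueller, stokes):
    """Multiply a 4x4 Mueller matrix by a Stokes vector [I, Q, U, V]."""
    return [sum(m * s for m, s in zip(row, stokes)) for row in mueller]

# Unpolarized light of unit intensity (assumed input for this toy example).
unpolarized = [1.0, 0.0, 0.0, 0.0]

# The first polarizer passes half of the unpolarized intensity.
after_first = apply_mueller(linear_polarizer(0.0), unpolarized)

# A second polarizer, either parallel or crossed with respect to the first.
parallel = apply_mueller(linear_polarizer(0.0), after_first)
crossed = apply_mueller(linear_polarizer(math.pi / 2), after_first)
```

The first element of each resulting Stokes vector is the transmitted intensity: it stays at half for the parallel pair and drops to zero (up to rounding) for the crossed pair, which is Malus's law expressed in Mueller form.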

Translated by Google Translate

DeepFormableTag: End-to-end Generation and Recognition of Deformable Fiducial Markers

Mustafa B. Yaldiz, Andreas Meuleman, Hyeonjoong Jang, Hyunho Ha, Min H. Kim

Category: Computer Vision

2022-06-16

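Fiducial markers have been widely used to identify objects or embed messages that can be detected by a camera. Existing detection methods mostly assume that markers are printed on ideally planar surfaces, and markers often fail to be recognized because of imaging artifacts such as optical and perspective distortion and motion blur. To overcome these limitations, we propose a novel deformable fiducial marker system that consists of three main parts. First, a fiducial marker generator creates a set of free-form color patterns that encode a large amount of information in unique visual codes. Second, a differentiable image simulator creates a training dataset of photorealistic scene images containing the deformed markers, rendered in a differentiable manner during optimization. The rendered images include realistic shading with specular reflection, optical distortion, defocus and motion blur, color alteration, imaging noise, and shape deformation of the markers. Finally, a trained marker detector seeks regions of interest and recognizes multiple marker patterns simultaneously via an inverse deformation transformation. The deformable marker generator and detector networks are jointly optimized end to end through the differentiable photorealistic renderer, allowing us to robustly recognize a wide range of deformable markers with high accuracy. Our deformable marker system successfully decodes 36-bit messages at ~29 fps under severe shape deformation. The results confirm that our system significantly outperforms traditional and data-driven marker methods. Our learning-based marker system opens up interesting new applications of fiducial markers, including cost-effective motion capture of the human body, active 3D scanning that uses arrays of our fiducial markers as structured light patterns, and robust augmented reality rendering of virtual objects on dynamic surfaces.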

Multimodal Wildland Fire Smoke Detection

Siddhant Baldota, Shreyas Anantha Ramaprasad, Jaspreet Kaur Bhamra, Shane Luna, Ravi Ramachandra, Eugene Zen, Harrison Kim, Daniel Crawl, Ismael Perez, Ilkay Altintas

Category: Computer Vision

2022-12-29

Research has shown that climate change creates warmer temperatures and drier conditions, leading to longer wildfire seasons and increased wildfire risks in the United States. These factors have in turn led to increases in the frequency, extent, and severity of wildfires in recent years. Given the danger posed by wildland fires to people, property, wildlife, and the environment, there is an urgency to provide tools for effective wildfire management. Early detection of wildfires is essential to minimizing potentially catastrophic destruction. In this paper, we present our work on integrating multiple data sources in SmokeyNet, a deep learning model using spatio-temporal information to detect smoke from wildland fires. Camera image data is integrated with weather sensor measurements and processed by SmokeyNet to create a multimodal wildland fire smoke detection system. We present our results comparing performance in terms of both accuracy and time-to-detection for multimodal data vs. a single data source. With a time-to-detection of only a few minutes, SmokeyNet can serve as an automated early notification system, providing a useful tool in the fight against destructive wildfires.

DiffuPose: Monocular 3D Human Pose Estimation via Denoising Diffusion Probabilistic Model

Jeongjun Choi, Dongseok Shim, H. Jin Kim

Category: Computer Vision

2022-12-06

Thanks to the development of 2D keypoint detectors, monocular 3D human pose estimation (HPE) via 2D-to-3D uplifting has achieved remarkable improvements. Still, monocular 3D HPE remains a challenging problem because of inherent depth ambiguities and occlusions. Many previous works exploit temporal information to mitigate these difficulties. However, there are many real-world applications where frame sequences are not accessible. This paper focuses on reconstructing a 3D pose from a single 2D keypoint detection. Rather than exploiting temporal information, we alleviate the depth ambiguity by generating multiple 3D pose candidates that can be mapped to an identical 2D keypoint. We build a novel diffusion-based framework to effectively sample diverse 3D poses from the output of an off-the-shelf 2D detector. We further improve performance by replacing the conventional denoising U-Net with a graph convolutional network, which accounts for the correlation between human joints. We evaluate our method on the widely adopted Human3.6M and HumanEva-I datasets. Comprehensive experiments demonstrate the efficacy of the proposed method and confirm that our model outperforms state-of-the-art multi-hypothesis 3D HPE methods.
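
The depth ambiguity that motivates multiple 3D candidates can be illustrated with a minimal sketch. It assumes a simple pinhole camera with a hypothetical focal length of 1.0 and a single joint; it does not implement the diffusion model or the graph convolutional denoiser, and the helper names `project` and `backproject` are made up for this example.

```python
def project(point3d, focal=1.0):
    """Pinhole projection of a 3D point (x, y, z) with z > 0 onto the image plane."""
    x, y, z = point3d
    return (focal * x / z, focal * y / z)

def backproject(keypoint2d, depth, focal=1.0):
    """Lift a 2D keypoint to the 3D point at the given depth along its camera ray."""
    u, v = keypoint2d
    return (u * depth / focal, v * depth / focal, depth)

# One detected 2D keypoint (assumed values for illustration).
keypoint = (0.2, -0.1)

# Several depth hypotheses give distinct 3D candidates for the same joint.
candidates = [backproject(keypoint, depth) for depth in (2.0, 3.5, 5.0)]

# Every candidate maps back to the same 2D keypoint.
reprojected = [project(candidate) for candidate in candidates]
```

Because every candidate reprojects to the same image location, a single 2D detection cannot pick one of them; a multi-hypothesis method returns a set of plausible 3D poses instead of a single estimate.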

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick, Suzana Ilić, Daniel Hesslow, Roman Castagné, Alexandra Sasha Luccioni, François Yvon, Matthias Gallé

Category: Natural Language Processing

2022-11-09

Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.

Decentralized Deadlock-free Trajectory Planning for Quadrotor Swarm in Obstacle-rich Environments -- Extended version

Jungwon Park, Inkyu Jang, H. Jin Kim

Category: Robotics

2022-09-20

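This paper presents a decentralized multi-agent trajectory planning (MATP) algorithm that is guaranteed to generate safe, deadlock-free trajectories in obstacle-rich environments under a limited communication range. The proposed algorithm uses a grid-based multi-agent path planning (MAPP) algorithm for deadlock resolution, and we introduce a subgoal optimization method that makes each agent converge to the waypoints generated by MAPP without deadlock. In addition, the proposed algorithm ensures the feasibility of the optimization problem and collision avoidance by adopting a linear safe corridor (LSC). We verify that the proposed algorithm does not cause deadlock in random forests or dense mazes regardless of the communication range, and that it outperforms our previous work in both flight time and flight distance. We validate the proposed algorithm through a hardware demonstration with ten quadrotors.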

Don't Judge a Language Model by Its Last Layer: Contrastive Learning with Layer-Wise Attention Pooling

Dongsuk Oh, Yejin Kim, Hodong Lee, H. Howie Huang, Heuiseok Lim

Category: Natural Language Processing | Artificial Intelligence

2022-09-13

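Recent pre-trained language models (PLMs) have achieved great success on many natural language processing tasks by learning linguistic features and contextualized sentence representations. Because the attributes captured in the stacked layers of PLMs are not clearly identified, straightforward approaches such as embedding the last layer are commonly preferred for deriving sentence representations from PLMs. This paper introduces an attention-based pooling strategy that enables the model to preserve the layer-wise signals captured in each layer and to learn digested linguistic features for downstream tasks. A contrastive learning objective adapts the layer-wise attention pooling to both unsupervised and supervised settings. This regularizes the anisotropic space of the pre-trained embeddings and makes it more uniform. We evaluate our model on standard semantic textual similarity (STS) and semantic search tasks. Our method improves the performance of the contrastively learned BERT_base baseline and its variants.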
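
The idea of pooling across layers rather than reading only the last layer can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's architecture: each layer is assumed to already provide one sentence vector, the attention score is assumed to be a plain dot product with a hypothetical query vector, and the contrastive training objective is omitted.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]

def layer_attention_pool(layer_vectors, query):
    """Combine per-layer sentence vectors with softmax(dot(query, vector)) weights."""
    scores = [sum(q * v for q, v in zip(query, vec)) for vec in layer_vectors]
    weights = softmax(scores)
    dim = len(layer_vectors[0])
    pooled = [
        sum(weight * vec[i] for weight, vec in zip(weights, layer_vectors))
        for i in range(dim)
    ]
    return pooled, weights

# Three toy "layers", each giving a 2-dimensional sentence vector (assumed values).
layers = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# A zero query scores every layer equally, so pooling reduces to a plain average.
query = [0.0, 0.0]
pooled, weights = layer_attention_pool(layers, query)
```

In a trained model the query (or a richer scoring function) would be learned, so that the weights shift toward the layers that carry the most useful signal for the downstream task.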

Interference Cancellation GAN Framework for Dynamic Channels

Hung T. Nguyen, Steven Bottone, Kwang Taik Kim, Mung Chiang, H. Vincent Poor

Category: Machine Learning | Artificial Intelligence

2022-08-17

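Symbol detection is a fundamental and challenging problem in modern communication systems, for example in multiuser multiple-input multiple-output (MIMO) settings. Iterative soft interference cancellation (SIC) is a state-of-the-art method for this task, and it has recently motivated data-driven neural network models, such as DeepSIC, that can handle unknown non-linear channels. However, these neural network models require thorough, time-consuming training before they can be applied, so they are not readily suited to highly dynamic channels in practice. We introduce an online training framework that can swiftly adapt to any change in the channel. Our proposed framework unifies recent deep unfolding approaches with emerging generative adversarial networks (GANs) to capture changes in the channel and quickly adjust the networks to maintain the best performance of the model. We demonstrate that our framework significantly outperforms recent neural network models on highly dynamic channels, and in our experiments it even surpasses them on static channels.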

Learning Physics from the Machine: An Interpretable Boosted Decision Tree Analysis for the Majorana Demonstrator

I. J. Arnquist, F. T. Avignone III, A. S. Barabash, C. J. Barton, K. H. Bhimani, E. Blalock, B. Bos, M. Busch, M. Buuck, T. S. Caldwell

Category: Machine Learning

2022-07-21

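The Majorana Demonstrator is a leading experiment searching for neutrinoless double-beta decay with high-purity germanium (HPGe) detectors. Machine learning provides a new way to maximize the amount of information obtained from these detectors, but its data-driven nature makes it less interpretable than traditional analysis. An interpretability study reveals the machine's decision-making logic, allowing us to learn from the machine and feed that knowledge back into the traditional analysis. In this work, we present the first machine learning analysis of data from the Majorana Demonstrator. It is also the first interpretable machine learning analysis of any germanium detector experiment. Two gradient boosted decision tree models are trained to learn from the data, and a game-theory-based model interpretability study is conducted to understand the origin of the classification power. By learning from the data, this analysis recognizes correlations among reconstruction parameters that further enhance the background rejection performance. By learning from the machine, this analysis reveals the importance of new background categories, which in turn benefits the standard Majorana analysis. The model is highly compatible with next-generation germanium detector experiments such as LEGEND, since it can be trained simultaneously on a large number of detectors.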
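
The abstract does not name the game-theoretic method it uses. One standard choice for this kind of attribution is the Shapley value, and the sketch below computes exact Shapley values for a toy value function as an illustration of that idea. The assumption that Shapley values are the relevant tool, the function names, and the toy value function are all introduced here for illustration and do not come from the analysis itself.

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, n_features):
    """Exact Shapley values by enumerating every coalition (feasible only for small n)."""
    n = n_features
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(len(others) + 1):
            # Probability weight of a coalition of this size preceding feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = frozenset(subset)
                marginal = value_fn(coalition | {i}) - value_fn(coalition)
                phi[i] += weight * marginal
    return phi

def toy_value(coalition):
    """Hypothetical model score for a set of available features."""
    score = 0.0
    if 0 in coalition:
        score += 1.0  # feature 0 contributes on its own
    if 1 in coalition:
        score += 2.0  # feature 1 contributes on its own
    if 0 in coalition and 1 in coalition:
        score += 1.0  # extra gain only when features 0 and 1 appear together
    return score      # feature 2 never changes the score

phi = shapley_values(toy_value, 3)
```

The interaction gain is split evenly between features 0 and 1, the irrelevant feature 2 receives zero, and the attributions sum to the score of the full feature set. Exact enumeration grows exponentially with the number of features, so practical tools for tree models rely on faster specialized algorithms.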

Fully Distributed Informative Planning for Environmental Learning with Multi-Robot Systems

Dohyun Jang, Jaehyun Yoo, Clark Youngdong Son, H. Jin Kim

Category: Robotics

2021-12-29

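This paper proposes a cooperative environmental learning algorithm that works in a fully distributed manner. A multi-robot system is more effective than a single robot, but it involves the following challenges: 1) online distributed learning of an environmental map using multiple robots; 2) generation of safe and efficient exploration paths based on the learned map; and 3) maintaining scalability with respect to the number of robots. To this end, we divide the entire process into two stages, environmental learning and path planning. Distributed algorithms are applied in each stage and combined through communication between adjacent robots. The environmental learning algorithm uses a distributed Gaussian process, and the path planning algorithm uses a distributed Monte Carlo tree search. As a result, we build a scalable system with no constraint on the number of robots. Simulation results demonstrate the performance and scalability of the proposed system. In addition, a simulation based on a real-world dataset validates the utility of our algorithm in a more realistic scenario.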
